Dataset statistics
| Number of variables | 25 |
|---|---|
| Number of observations | 138556 |
| Missing cells | 137135 |
| Missing cells (%) | 4.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 26.4 MiB |
| Average record size in memory | 200.0 B |
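The percentage and per-record figures in the table above can be rederived from the raw counts. The sketch below only redoes that arithmetic with the numbers copied from the table; it assumes that "missing cells (%)" is taken over all cells (variables times observations) and that MiB means 1024² bytes.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 25
n_observations = 138556
missing_cells = 137135
total_memory_mib = 26.4

# Assumption: the missing-cell percentage is relative to all cells.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024**2 bytes; average record size is memory / rows.
avg_record_bytes = total_memory_mib * 1024 ** 2 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.0f} B")
```

Both results round to the values shown in the report (4.0% and 200 B).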
Variable types
| CAT | 17 |
|---|---|
| NUM | 8 |
Warnings
| DOB has a high cardinality: 900 distinct values | High cardinality |
|---|---|
| DOD has 137135 (99.0%) missing values | Missing |
| BID has unique values | Unique |
| County has 2977 (2.1%) zeros | Zeros |
| InpatientAnnualReimbursementAmt has 102511 (74.0%) zeros | Zeros |
| InpatientAnnualDeductibleAmt has 102019 (73.6%) zeros | Zeros |
| OutpatientAnnualReimbursementAmt has 4205 (3.0%) zeros | Zeros |
| OutpatientAnnualDeductibleAmt has 13890 (10.0%) zeros | Zeros |
Reproduction
| Analysis started | 2020-10-13 16:28:03.353280 |
|---|---|
| Analysis finished | 2020-10-13 16:28:43.920685 |
| Duration | 40.57 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
BID
Categorical
| Distinct | 138556 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| BENE147419 | 1 | < 0.1% |
| BENE131588 | 1 | < 0.1% |
| BENE156585 | 1 | < 0.1% |
| BENE134117 | 1 | < 0.1% |
| BENE49085 | 1 | < 0.1% |
| BENE151525 | 1 | < 0.1% |
| BENE156611 | 1 | < 0.1% |
| BENE45480 | 1 | < 0.1% |
| BENE38765 | 1 | < 0.1% |
| BENE93365 | 1 | < 0.1% |
| Other values (138546) | 138546 | > 99.9% |
Unique
| Unique | 138556 |
|---|---|
| Unique (%) | 100.0% |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 9.39984555 |
| Min length | 9 |
DOB
Categorical
| Distinct | 900 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1939-10-01 | 540 | 0.4% |
| 1941-10-01 | 538 | 0.4% |
| 1939-03-01 | 535 | 0.4% |
| 1940-03-01 | 526 | 0.4% |
| 1939-04-01 | 517 | 0.4% |
| 1941-05-01 | 513 | 0.4% |
| 1943-12-01 | 512 | 0.4% |
| 1941-12-01 | 512 | 0.4% |
| 1942-12-01 | 509 | 0.4% |
| 1943-11-01 | 509 | 0.4% |
| Other values (890) | 133345 | 96.2% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
DOD
Categorical
| Distinct | 11 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 137135 |
| Missing (%) | 99.0% |
| Memory size | 1.1 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2009-12-01 | 182 | 0.1% |
| 2009-10-01 | 168 | 0.1% |
| 2009-09-01 | 164 | 0.1% |
| 2009-11-01 | 149 | 0.1% |
| 2009-08-01 | 144 | 0.1% |
| 2009-07-01 | 141 | 0.1% |
| 2009-05-01 | 119 | 0.1% |
| 2009-06-01 | 119 | 0.1% |
| 2009-04-01 | 94 | 0.1% |
| 2009-03-01 | 91 | 0.1% |
| (Missing) | 137135 | 99.0% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 10 |
|---|---|
| Median length | 3 |
| Mean length | 3.071790467 |
| Min length | 3 |
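The length statistics above look odd for a date column: the maximum is 10 characters but the median is 3. One plausible explanation, stated here as an assumption rather than something the report confirms, is that missing values were measured as the 3-character string "nan". The sketch below checks whether that assumption reproduces the reported mean length.

```python
# Counts copied from the report.
n_observations = 138556
n_missing = 137135
n_present = n_observations - n_missing  # rows that hold an actual date

# Assumption: a missing value is counted as the string "nan" (3 characters),
# while a real date such as "2009-12-01" has 10 characters.
total_length = n_missing * len("nan") + n_present * len("2009-12-01")
mean_length = total_length / n_observations

print(n_present)
print(mean_length)
```

Under that assumption the computed mean agrees with the reported 3.071790467, and a median of 3 follows because 99.0% of the rows are missing.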
Gender
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 79106 | 57.1% |
| 1 | 59450 | 42.9% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Race
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 117057 | 84.5% |
| 2 | 13538 | 9.8% |
| 3 | 5059 | 3.7% |
| 5 | 2902 | 2.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
RenalDisease
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 118978 | 85.9% |
| Y | 19578 | 14.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
State
Real number (ℝ≥0)
| Distinct | 52 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 25.66673403 |
|---|---|
| Minimum | 1 |
| Maximum | 54 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 11 |
| median | 25 |
| Q3 | 39 |
| 95-th percentile | 50 |
| Maximum | 54 |
| Range | 53 |
| Interquartile range (IQR) | 28 |
Descriptive statistics
| Standard deviation | 15.22344304 |
|---|---|
| Coefficient of variation (CV) | 0.5931196007 |
| Kurtosis | -1.249185339 |
| Mean | 25.66673403 |
| Median Absolute Deviation (MAD) | 14 |
| Skewness | 0.08072830774 |
| Sum | 3556280 |
| Variance | 231.7532179 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 12052 | 8.7% |
| 10 | 9771 | 7.1% |
| 45 | 8780 | 6.3% |
| 33 | 8443 | 6.1% |
| 39 | 6055 | 4.4% |
| 14 | 5923 | 4.3% |
| 36 | 5366 | 3.9% |
| 23 | 5293 | 3.8% |
| 34 | 4629 | 3.3% |
| 31 | 4124 | 3.0% |
| Other values (42) | 68120 | 49.2% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 2615 | 1.9% |
| 2 | 196 | 0.1% |
| 3 | 2395 | 1.7% |
| 4 | 1817 | 1.3% |
| 5 | 12052 | 8.7% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 54 | 1237 | 0.9% |
| 53 | 295 | 0.2% |
| 52 | 2662 | 1.9% |
| 51 | 1212 | 0.9% |
| 50 | 2793 | 2.0% |
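Several rows in the State statistics are derived from others: the range and IQR come from the quantiles, and the coefficient of variation and variance come from the mean and standard deviation. A minimal sketch of those relationships, using the numbers copied from the State tables (the variable names are illustrative only):

```python
# Values copied from the State quantile and descriptive statistics tables.
minimum, maximum = 1, 54
q1, q3 = 11, 39
mean = 25.66673403
std = 15.22344304

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance is the squared standard deviation

print(value_range, iqr)
print(cv, variance)
```

The same relationships hold for the other numeric variables in the report, so they can serve as a quick consistency check on any of the tables.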
County
Real number (ℝ≥0)
| Distinct | 314 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 374.4247452 |
|---|---|
| Minimum | 0 |
| Maximum | 999 |
| Zeros | 2977 |
| Zeros (%) | 2.1% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 20 |
| Q1 | 141 |
| median | 340 |
| Q3 | 570 |
| 95-th percentile | 881 |
| Maximum | 999 |
| Range | 999 |
| Interquartile range (IQR) | 429 |
Descriptive statistics
| Standard deviation | 266.2775811 |
|---|---|
| Coefficient of variation (CV) | 0.711164485 |
| Kurtosis | -0.7522266009 |
| Mean | 374.4247452 |
| Median Absolute Deviation (MAD) | 200 |
| Skewness | 0.4671250956 |
| Sum | 51878795 |
| Variance | 70903.75021 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 200 | 3943 | 2.8% |
| 10 | 3587 | 2.6% |
| 20 | 3176 | 2.3% |
| 60 | 3003 | 2.2% |
| 0 | 2977 | 2.1% |
| 90 | 2833 | 2.0% |
| 470 | 2768 | 2.0% |
| 400 | 2738 | 2.0% |
| 160 | 2526 | 1.8% |
| 150 | 2411 | 1.7% |
| Other values (304) | 108594 | 78.4% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 2977 | 2.1% |
| 1 | 3 | < 0.1% |
| 10 | 3587 | 2.6% |
| 11 | 64 | < 0.1% |
| 14 | 2 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 999 | 264 | 0.2% |
| 996 | 16 | < 0.1% |
| 994 | 25 | < 0.1% |
| 993 | 17 | < 0.1% |
| 992 | 52 | < 0.1% |
NumOfMonths_PartACov
Real number (ℝ≥0)
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11.90772684 |
|---|---|
| Minimum | 0 |
| Maximum | 12 |
| Zeros | 1000 |
| Zeros (%) | 0.7% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 12 |
| Q1 | 12 |
| median | 12 |
| Q3 | 12 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 12 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.032331747 |
|---|---|
| Coefficient of variation (CV) | 0.08669427511 |
| Kurtosis | 126.3832027 |
| Mean | 11.90772684 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -11.29222571 |
| Sum | 1649887 |
| Variance | 1.065708835 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 12 | 137389 | 99.2% |
| 0 | 1000 | 0.7% |
| 6 | 38 | < 0.1% |
| 11 | 28 | < 0.1% |
| 8 | 26 | < 0.1% |
| 10 | 18 | < 0.1% |
| 7 | 16 | < 0.1% |
| 4 | 13 | < 0.1% |
| 5 | 8 | < 0.1% |
| 9 | 7 | < 0.1% |
| Other values (3) | 13 | < 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1000 | 0.7% |
| 1 | 3 | < 0.1% |
| 2 | 5 | < 0.1% |
| 3 | 5 | < 0.1% |
| 4 | 13 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 12 | 137389 | 99.2% |
| 11 | 28 | < 0.1% |
| 10 | 18 | < 0.1% |
| 9 | 7 | < 0.1% |
| 8 | 26 | < 0.1% |
NumOfMonths_PartBCov
Real number (ℝ≥0)
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11.91014463 |
|---|---|
| Minimum | 0 |
| Maximum | 12 |
| Zeros | 675 |
| Zeros (%) | 0.5% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 12 |
| Q1 | 12 |
| median | 12 |
| Q3 | 12 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 12 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.9368933355 |
|---|---|
| Coefficient of variation (CV) | 0.07866347254 |
| Kurtosis | 135.9775469 |
| Mean | 11.91014463 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -11.4782554 |
| Sum | 1650222 |
| Variance | 0.877769122 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 12 | 136902 | 98.8% |
| 0 | 675 | 0.5% |
| 6 | 282 | 0.2% |
| 10 | 150 | 0.1% |
| 11 | 143 | 0.1% |
| 9 | 122 | 0.1% |
| 8 | 71 | 0.1% |
| 7 | 63 | < 0.1% |
| 5 | 50 | < 0.1% |
| 4 | 35 | < 0.1% |
| Other values (3) | 63 | < 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 675 | 0.5% |
| 1 | 17 | < 0.1% |
| 2 | 19 | < 0.1% |
| 3 | 27 | < 0.1% |
| 4 | 35 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 12 | 136902 | 98.8% |
| 11 | 143 | 0.1% |
| 10 | 150 | 0.1% |
| 9 | 122 | 0.1% |
| 8 | 71 | 0.1% |
Chronic_Alzheimer
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 92530 | 66.8% |
| 1 | 46026 | 33.2% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Chronic_Heartfailure
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 70154 | 50.6% |
| 1 | 68402 | 49.4% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Chronic_KidneyDisease
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 95277 | 68.8% |
| 1 | 43279 | 31.2% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Chronic_Cancer
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 121935 | 88.0% |
| 1 | 16621 | 12.0% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Chronic_ObstrPulmonary
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 105697 | 76.3% |
| 1 | 32859 | 23.7% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Chronic_Depression
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 89296 | 64.4% |
| 1 | 49260 | 35.6% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Chronic_Diabetes
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 83391 | 60.2% |
| 2 | 55165 | 39.8% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Chronic_IschemicHeart
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 93644 | 67.6% |
| 2 | 44912 | 32.4% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Chronic_Osteoporasis
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 100497 | 72.5% |
| 1 | 38059 | 27.5% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Chronic_rheumatoidarthritis
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 102972 | 74.3% |
| 1 | 35584 | 25.7% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Chronic_stroke
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 127602 | 92.1% |
| 1 | 10954 | 7.9% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
InpatientAnnualReimbursementAmt
Real number (ℝ)
| Distinct | 3004 |
|---|---|
| Distinct (%) | 2.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3660.346502 |
|---|---|
| Minimum | -8000 |
| Maximum | 161470 |
| Zeros | 102511 |
| Zeros (%) | 74.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | -8000 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 2280 |
| 95-th percentile | 20260 |
| Maximum | 161470 |
| Range | 169470 |
| Interquartile range (IQR) | 2280 |
Descriptive statistics
| Standard deviation | 9568.621827 |
|---|---|
| Coefficient of variation (CV) | 2.614130061 |
| Kurtosis | 31.02521782 |
| Mean | 3660.346502 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.636541731 |
| Sum | 507162970 |
| Variance | 91558523.67 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 102511 | 74.0% |
| 4000 | 2123 | 1.5% |
| 5000 | 1851 | 1.3% |
| 3000 | 1800 | 1.3% |
| 6000 | 1589 | 1.1% |
| 7000 | 1274 | 0.9% |
| 8000 | 1230 | 0.9% |
| 9000 | 1105 | 0.8% |
| 10000 | 1085 | 0.8% |
| 11000 | 1018 | 0.7% |
| Other values (2994) | 22970 | 16.6% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| -8000 | 1 | < 0.1% |
| -1400 | 1 | < 0.1% |
| -1000 | 1 | < 0.1% |
| -640 | 1 | < 0.1% |
| -500 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 161470 | 1 | < 0.1% |
| 155600 | 1 | < 0.1% |
| 155270 | 1 | < 0.1% |
| 153580 | 1 | < 0.1% |
| 148580 | 1 | < 0.1% |
InpatientAnnualDeductibleAmt
Real number (ℝ≥0)
| Distinct | 147 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 399.8472964 |
|---|---|
| Minimum | 0 |
| Maximum | 38272 |
| Zeros | 102019 |
| Zeros (%) | 73.6% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1068 |
| 95-th percentile | 2136 |
| Maximum | 38272 |
| Range | 38272 |
| Interquartile range (IQR) | 1068 |
Descriptive statistics
| Standard deviation | 956.1752023 |
|---|---|
| Coefficient of variation (CV) | 2.391350926 |
| Kurtosis | 268.1092213 |
| Mean | 399.8472964 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 10.45326205 |
| Sum | 55401242 |
| Variance | 914271.0176 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 102019 | 73.6% |
| 1068 | 27113 | 19.6% |
| 2136 | 6418 | 4.6% |
| 3204 | 1481 | 1.1% |
| 4272 | 369 | 0.3% |
| 3068 | 98 | 0.1% |
| 2068 | 87 | 0.1% |
| 5340 | 79 | 0.1% |
| 4068 | 67 | < 0.1% |
| 5068 | 59 | < 0.1% |
| Other values (137) | 766 | 0.6% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 102019 | 73.6% |
| 1068 | 27113 | 19.6% |
| 1088 | 2 | < 0.1% |
| 1098 | 4 | < 0.1% |
| 1118 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 38272 | 1 | < 0.1% |
| 37204 | 1 | < 0.1% |
| 36136 | 3 | < 0.1% |
| 35204 | 1 | < 0.1% |
| 35068 | 4 | < 0.1% |
OutpatientAnnualReimbursementAmt
Real number (ℝ)
| Distinct | 2078 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1298.219348 |
|---|---|
| Minimum | -70 |
| Maximum | 102960 |
| Zeros | 4205 |
| Zeros (%) | 3.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | -70 |
|---|---|
| 5-th percentile | 20 |
| Q1 | 170 |
| median | 570 |
| Q3 | 1500 |
| 95-th percentile | 4370 |
| Maximum | 102960 |
| Range | 103030 |
| Interquartile range (IQR) | 1330 |
Descriptive statistics
| Standard deviation | 2493.901134 |
|---|---|
| Coefficient of variation (CV) | 1.921016766 |
| Kurtosis | 159.619928 |
| Mean | 1298.219348 |
| Median Absolute Deviation (MAD) | 480 |
| Skewness | 8.606026117 |
| Sum | 179876080 |
| Variance | 6219542.865 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 4205 | 3.0% |
| 100 | 3916 | 2.8% |
| 200 | 3153 | 2.3% |
| 60 | 2694 | 1.9% |
| 300 | 2280 | 1.6% |
| 90 | 2178 | 1.6% |
| 80 | 2087 | 1.5% |
| 50 | 2086 | 1.5% |
| 40 | 2048 | 1.5% |
| 400 | 2045 | 1.5% |
| Other values (2068) | 111864 | 80.7% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| -70 | 1 | < 0.1% |
| -60 | 3 | < 0.1% |
| -50 | 2 | < 0.1% |
| -40 | 1 | < 0.1% |
| -20 | 2 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 102960 | 1 | < 0.1% |
| 101250 | 1 | < 0.1% |
| 97510 | 1 | < 0.1% |
| 94910 | 1 | < 0.1% |
| 86980 | 1 | < 0.1% |
OutpatientAnnualDeductibleAmt
Real number (ℝ≥0)
| Distinct | 789 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 377.7182583 |
|---|---|
| Minimum | 0 |
| Maximum | 13840 |
| Zeros | 13890 |
| Zeros (%) | 10.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 40 |
| median | 170 |
| Q3 | 460 |
| 95-th percentile | 1340 |
| Maximum | 13840 |
| Range | 13840 |
| Interquartile range (IQR) | 420 |
Descriptive statistics
| Standard deviation | 645.5301866 |
|---|---|
| Coefficient of variation (CV) | 1.709025636 |
| Kurtosis | 47.74698455 |
| Mean | 377.7182583 |
| Median Absolute Deviation (MAD) | 150 |
| Skewness | 5.435348431 |
| Sum | 52335131 |
| Variance | 416709.2218 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 13890 | 10.0% |
| 20 | 7271 | 5.2% |
| 10 | 6140 | 4.4% |
| 30 | 4755 | 3.4% |
| 100 | 4743 | 3.4% |
| 40 | 4312 | 3.1% |
| 200 | 4043 | 2.9% |
| 50 | 3577 | 2.6% |
| 60 | 3266 | 2.4% |
| 70 | 3081 | 2.2% |
| Other values (779) | 83478 | 60.2% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 13890 | 10.0% |
| 10 | 6140 | 4.4% |
| 20 | 7271 | 5.2% |
| 30 | 4755 | 3.4% |
| 40 | 4312 | 3.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 13840 | 1 | < 0.1% |
| 13040 | 1 | < 0.1% |
| 12090 | 1 | < 0.1% |
| 11800 | 1 | < 0.1% |
| 11570 | 1 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
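The calculation described above (covariance divided by the product of the standard deviations) can be sketched in plain Python. This is an illustrative implementation, not the code the report generator uses; the function name `pearson_r` is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of products are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


x = [1, 2, 3, 4]
y = [2, 4, 6, 8]
print(pearson_r(x, y))
# Shifting and (positively) rescaling x leaves r unchanged.
print(pearson_r([10 * a + 5 for a in x], y))
```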
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
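As a sketch of the rank-based calculation described above: convert each variable to ranks, then apply the Pearson formula to the ranks. The helper names are hypothetical, and giving tied values the average of their ranks is an assumption (a common convention), not something the report states.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span (assumed convention)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# A monotonic but nonlinear relationship (y = x**3).
print(spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]))
```

Because only the ordering matters, the cubic relationship in the usage line is treated as a perfect monotonic association.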
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
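The pair-counting definition above can be written directly as a short function. This sketch implements exactly that formula (often called tau-a); it is an illustration with a hypothetical function name, and library implementations frequently use a tie-adjusted variant instead, so results may differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, as described in the text."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # Pairs tied in x or y count as neither concordant nor discordant.
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# One swapped pair out of six: 5 concordant, 1 discordant.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```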
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | BID | DOB | DOD | Gender | Race | RenalDisease | State | County | NumOfMonths_PartACov | NumOfMonths_PartBCov | Chronic_Alzheimer | Chronic_Heartfailure | Chronic_KidneyDisease | Chronic_Cancer | Chronic_ObstrPulmonary | Chronic_Depression | Chronic_Diabetes | Chronic_IschemicHeart | Chronic_Osteoporasis | Chronic_rheumatoidarthritis | Chronic_stroke | InpatientAnnualReimbursementAmt | InpatientAnnualDeductibleAmt | OutpatientAnnualReimbursementAmt | OutpatientAnnualDeductibleAmt |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | BENE11001 | 1943-01-01 | NaN | 1 | 1 | 0 | 39 | 230 | 12 | 12 | 1 | 2 | 1 | 2 | 2 | 1 | 1 | 1 | 2 | 1 | 1 | 36000 | 3204 | 60 | 70 |
| 1 | BENE11002 | 1936-09-01 | NaN | 2 | 1 | 0 | 39 | 280 | 12 | 12 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 0 | 0 | 30 | 50 |
| 2 | BENE11003 | 1936-08-01 | NaN | 1 | 1 | 0 | 52 | 590 | 12 | 12 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 2 | 0 | 0 | 90 | 40 |
| 3 | BENE11004 | 1922-07-01 | NaN | 1 | 1 | 0 | 39 | 270 | 12 | 12 | 1 | 1 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 2 | 0 | 0 | 1810 | 760 |
| 4 | BENE11005 | 1935-09-01 | NaN | 1 | 1 | 0 | 24 | 680 | 12 | 12 | 2 | 2 | 2 | 2 | 1 | 2 | 1 | 2 | 2 | 2 | 2 | 0 | 0 | 1790 | 1200 |
| 5 | BENE11006 | 1976-09-01 | NaN | 2 | 1 | 0 | 23 | 810 | 12 | 12 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 0 | 0 | 500 | 0 |
| 6 | BENE11007 | 1940-09-01 | 2009-12-01 | 1 | 2 | 0 | 45 | 610 | 12 | 12 | 1 | 1 | 2 | 2 | 2 | 2 | 1 | 2 | 1 | 1 | 2 | 0 | 0 | 1490 | 160 |
| 7 | BENE11008 | 1934-02-01 | NaN | 2 | 1 | 0 | 15 | 140 | 12 | 12 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 2 | 2 | 0 | 0 | 30 | 0 |
| 8 | BENE11009 | 1929-06-01 | NaN | 1 | 1 | Y | 44 | 230 | 12 | 12 | 2 | 1 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 2 | 2 | 0 | 0 | 100 | 0 |
| 9 | BENE11010 | 1936-07-01 | NaN | 2 | 1 | 0 | 41 | 30 | 12 | 12 | 2 | 1 | 2 | 1 | 1 | 2 | 1 | 1 | 1 | 2 | 2 | 0 | 0 | 1170 | 660 |
Last rows
| | BID | DOB | DOD | Gender | Race | RenalDisease | State | County | NumOfMonths_PartACov | NumOfMonths_PartBCov | Chronic_Alzheimer | Chronic_Heartfailure | Chronic_KidneyDisease | Chronic_Cancer | Chronic_ObstrPulmonary | Chronic_Depression | Chronic_Diabetes | Chronic_IschemicHeart | Chronic_Osteoporasis | Chronic_rheumatoidarthritis | Chronic_stroke | InpatientAnnualReimbursementAmt | InpatientAnnualDeductibleAmt | OutpatientAnnualReimbursementAmt | OutpatientAnnualDeductibleAmt |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 138546 | BENE159188 | 1938-10-01 | NaN | 1 | 1 | Y | 31 | 150 | 12 | 12 | 2 | 1 | 2 | 1 | 2 | 2 | 1 | 1 | 2 | 2 | 2 | 15800 | 1068 | 6140 | 60 |
| 138547 | BENE159189 | 1941-04-01 | NaN | 2 | 1 | 0 | 18 | 361 | 12 | 12 | 2 | 2 | 2 | 1 | 2 | 2 | 2 | 1 | 1 | 2 | 2 | 0 | 0 | 1820 | 40 |
| 138548 | BENE159190 | 1939-11-01 | NaN | 1 | 1 | 0 | 29 | 10 | 12 | 12 | 2 | 1 | 1 | 2 | 2 | 2 | 1 | 1 | 2 | 2 | 1 | 0 | 0 | 160 | 0 |
| 138549 | BENE159191 | 1926-10-01 | NaN | 1 | 1 | 0 | 36 | 710 | 12 | 12 | 1 | 2 | 2 | 2 | 1 | 2 | 2 | 1 | 1 | 2 | 1 | 0 | 0 | 640 | 250 |
| 138550 | BENE159192 | 1937-04-01 | NaN | 2 | 1 | 0 | 21 | 30 | 12 | 12 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 2 | 1 | 2 | 0 | 0 | 420 | 100 |
| 138551 | BENE159194 | 1939-07-01 | NaN | 1 | 1 | 0 | 39 | 140 | 12 | 12 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 0 | 0 | 430 | 460 |
| 138552 | BENE159195 | 1938-12-01 | NaN | 2 | 1 | 0 | 49 | 530 | 12 | 12 | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 2 | 2 | 0 | 0 | 880 | 100 |
| 138553 | BENE159196 | 1916-06-01 | NaN | 2 | 1 | 0 | 6 | 150 | 12 | 12 | 2 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 2 | 2 | 2 | 2000 | 1068 | 3240 | 1390 |
| 138554 | BENE159197 | 1930-01-01 | NaN | 1 | 1 | 0 | 16 | 560 | 12 | 12 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 2 | 0 | 0 | 2650 | 10 |
| 138555 | BENE159198 | 1952-04-01 | NaN | 2 | 1 | 0 | 21 | 20 | 12 | 12 | 1 | 1 | 2 | 2 | 2 | 1 | 1 | 2 | 2 | 1 | 2 | 0 | 0 | 5470 | 1870 |